13 research outputs found

    Functional Analysis beyond Enrichment: Non-Redundant Reciprocal Linkage of Genes and Biological Terms

    Functional analysis of large sets of genes and proteins is becoming increasingly necessary as experimental biomolecular data accumulate at omic scale. Enrichment analysis is by far the most popular methodology for deriving the functional implications of sets of cooperating genes. The problem with these techniques lies in the redundancy of the resulting information, which in most cases generates many trivial results that risk masking key biological events. We present and describe a computational method, called GeneTerm Linker, that filters and links enriched output data, identifying sets of associated genes and terms and producing metagroups of coherent biological significance. The method uses fuzzy reciprocal linkage between genes and terms to unravel their functional convergence and associations. The algorithm is tested with a small set of well-known interacting proteins from yeast and with a large collection of reference sets from three heterogeneous resources: multiprotein complexes (CORUM), cellular pathways (SGD) and human diseases (OMIM). Statistical Precision, Recall and balanced F-score are calculated, showing robust results even when different levels of random noise are included in the test sets. Although we could not find an equivalent method, we present a comparative analysis with a widely used method that combines enrichment and functional annotation clustering. A web application to use the proposed method is provided at http://gtlinker.cnb.csic.es.
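
    The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes that a predicted group of genes is scored against a single reference set by plain set overlap; the function name and the example gene identifiers are hypothetical, and this is not taken from the GeneTerm Linker implementation.

```python
def precision_recall_f1(predicted, reference):
    """Score a predicted gene set against a reference gene set.

    Assumption: both arguments are iterables of gene identifiers and
    a true positive is any identifier present in both sets.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0

    # Balanced F-score: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical identifiers, used only to exercise the function.
    p, r, f = precision_recall_f1({"A", "B", "C", "D"}, {"A", "B", "E"})
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

    Adding random identifiers to the predicted set lowers precision while leaving recall unchanged, which is one simple way to probe robustness to noise.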

    MARQ: an online tool to mine GEO for experiments with similar or opposite gene expression signatures

    The enormous amount of data available in public gene expression repositories such as Gene Expression Omnibus (GEO) offers an inestimable resource to explore gene expression programs across several organisms and conditions. This information can be used to discover experiments that induce similar or opposite gene expression patterns to a given query, which in turn may lead to the discovery of new relationships among diseases, drugs or pathways, as well as the generation of new hypotheses. In this work, we present MARQ, a web-based application that allows researchers to compare a query set of genes, e.g. a set of over- and under-expressed genes, against a signature database built from GEO datasets for different organisms and platforms. MARQ offers an easy-to-use and integrated environment to mine GEO in order to identify conditions that induce similar or opposite gene expression patterns to a given experimental condition. MARQ also includes additional functionalities for the exploration of the results, including a meta-analysis pipeline to find genes that are differentially expressed across different experiments. The application is freely available at http://marq.dacya.ucm.es.

    LLM3D: a log-linear modeling-based method to predict functional gene regulatory interactions from genome-wide expression data

    All cellular processes are regulated by condition-specific and time-dependent interactions between transcription factors and their target genes. While in simple organisms, e.g. bacteria and yeast, a large amount of experimental data is available to support functional transcription regulatory interactions, in mammalian systems reconstruction of gene regulatory networks still heavily depends on the accurate prediction of transcription factor binding sites. Here, we present a new method, log-linear modeling of 3D contingency tables (LLM3D), to predict functional transcription factor binding sites. LLM3D combines gene expression data, gene ontology annotation and computationally predicted transcription factor binding sites in a single statistical analysis, and offers a methodological improvement over existing enrichment-based methods. We show that LLM3D successfully identifies novel transcriptional regulators of the yeast metabolic cycle, and correctly predicts key regulators of mouse embryonic stem cell self-renewal more accurately than existing enrichment-based methods. Moreover, in a clinically relevant in vivo injury model of mammalian neurons, LLM3D identified peroxisome proliferator-activated receptor γ (PPARγ) as a neuron-intrinsic transcriptional regulator of regenerative axon growth. In conclusion, LLM3D provides a significant improvement over existing methods in predicting functional transcription regulatory interactions in the absence of experimental transcription factor binding data

    Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling

    Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data, including normalization (covering most commercial platforms), pre-processing, differential gene expression (case-control, multiclass, survival or continuous values), predictors and clustering, as well as methods for large-scale genotyping assays (case-control and TDT analyses, with population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein–protein interaction modules, can be used for this purpose. Finally, a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command-line execution of its individual components are now possible. Babelomics is available at http://www.babelomics.org.
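
    As an illustration of what a functional enrichment method computes, the sketch below uses the one-sided hypergeometric test, a common choice for this kind of analysis. That choice is an assumption: the abstract does not say which tests Babelomics implements, and the function and parameter names here are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value for term over-representation.

    population: number of genes in the background
    annotated:  background genes carrying the term (e.g. a GO term)
    selected:   size of the gene list under study
    overlap:    genes in the list that carry the term

    Returns P(X >= overlap) when `selected` genes are drawn without
    replacement from the background.
    """
    upper = min(annotated, selected)
    # Count favourable draws exactly with integers, then divide once.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, selected)


if __name__ == "__main__":
    # Toy numbers: 20 background genes, 5 annotated, 4 selected, 2 shared.
    print(enrichment_p_value(20, 5, 4, 2))
```

    In practice such p-values are computed for many terms at once and need a multiple-testing correction, which this sketch omits.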

    Pascual-Montano: Finding Closed Frequent Item Sets by Intersecting Transactions. EDBT

    Most known frequent item set mining algorithms work by enumerating candidate item sets and pruning infrequent candidates. An alternative method, which works by intersecting transactions, is much less researched. To the best of our knowledge, there are only two basic algorithms: a cumulative scheme, which is based on a repository with which new transactions are intersected, and the Carpenter algorithm, which enumerates and intersects candidate transaction sets. These approaches yield the set of so-called closed frequent item sets, since any such item set can be represented as the intersection of some subset of the given transactions. In this paper we describe a considerably improved implementation scheme of the cumulative approach, which relies on a prefix tree representation of the already found intersections. In addition, we present an improved way of implementing the Carpenter algorithm. We demonstrate that on specific data sets, which occur particularly often in the area of gene expression analysis, our implementations significantly outperform enumeration approaches to frequent item set mining
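
    The cumulative scheme mentioned in this abstract can be sketched naively as follows. This is a minimal illustration that keeps the repository in a plain dictionary, not the prefix-tree implementation the paper describes; the function name is hypothetical, and the empty item set is ignored for simplicity.

```python
def closed_frequent_item_sets(transactions, min_support):
    """Find closed frequent item sets by intersecting transactions.

    The repository maps each closed item set found so far to its
    support. Each new transaction is intersected with every set in
    the repository; an intersection inherits the largest support of
    any set that produced it, plus one for the new transaction.
    """
    repository = {}
    for transaction in transactions:
        t = frozenset(transaction)
        if not t:
            continue
        # The transaction itself is closed, with support at least 1.
        updates = {t: 1}
        for item_set, support in repository.items():
            common = item_set & t
            if common:
                updates[common] = max(updates.get(common, 0), support + 1)
        repository.update(updates)
    # Filter by minimum support only after all transactions are seen.
    return {s: n for s, n in repository.items() if n >= min_support}


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
    for item_set, support in closed_frequent_item_sets(data, 2).items():
        print(sorted(item_set), support)
```

    The cost of this naive version grows with the number of stored intersections, which is the part a more compact repository representation is meant to address.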

    Distinct molecular signature of murine fetal liver and adult hematopoietic stem cells identifies novel regulators of hematopoietic stem cell function

    During ontogeny, fetal liver (FL) acts as a major site for hematopoietic stem cell (HSC) maturation and expansion, whereas HSCs in the adult bone marrow (ABM) are largely quiescent. HSCs in the FL possess faster repopulation capacity than ABM HSCs. However, the molecular mechanism regulating the greater self-renewal potential of FL HSCs has not yet been extensively assessed. Recently, we published RNA sequencing-based gene expression analysis of FL HSCs from 14.5-day mouse embryos (E14.5) in comparison with ABM HSCs. We reanalyzed these data to identify key transcriptional regulators that play important roles in the expansion of HSCs during development. The comparison of FL E14.5 with ABM HSCs identified more than 1,400 differentially expressed genes. More than 200 genes were shortlisted based on the gene ontology (GO) annotation term "transcription." By morpholino-based knockdown studies in zebrafish, we assessed the function of 18 of these regulators, previously not associated with HSC proliferation. Our studies identified a previously unknown role for tdg, uhrf1, uchl5, and ncoa1 in the emergence of definitive hematopoiesis in zebrafish. In conclusion, we demonstrate that identifying genes involved in transcriptional regulation that are differentially expressed between expanding FL HSCs and quiescent ABM HSCs uncovers novel regulators of HSC function.

    Additional file 2: of Transcriptomic dynamics of breast cancer progression in the MMTV-PyMT mouse model

    Tables S1–S4. DEGs from the four stages. Table S5. Genes in the WGCNA modules. Table S6. RNA expression and protein staining scores from The Human Protein Atlas. (XLSX 181 kb)